Skip to content

perf: fix dictionary double-lookups, Collection.Contains, and LINQ allocations#15533

Merged
nohwnd merged 5 commits intomainfrom
dev/amauryleve/perf-fix-dictionary-double-lookups
Mar 26, 2026
Merged

perf: fix dictionary double-lookups, Collection.Contains, and LINQ allocations#15533
nohwnd merged 5 commits intomainfrom
dev/amauryleve/perf-fix-dictionary-double-lookups

Conversation

@Evangelink
Copy link
Copy Markdown
Member

Summary

Fix several obvious performance issues related to data types and redundant iterations across the codebase.

Changes

1. Dictionary ContainsKey + indexer → TryGetValue

Eliminates double hash lookups in ~15 locations. Each ContainsKey + dict[key] pair performs two hash operations when TryGetValue suffices with one.

Files changed:

  • ParallelRunDataAggregator.csGetAggregatedRunStats()
  • TestSessionPool.csKillSession(), TryTakeProxy(), ReturnProxy()
  • ProxyTestSessionManager.csDequeueProxy()
  • InProcDataCollectionSink.csAddKeyValuePairToDictionary(), AddOrUpdateData()
  • PortableSymbolReader.csGetNavigationData()
  • FullSymbolReader.csGetTypeSymbol(), GetMethodSymbol()
  • TestSessionStartArgs.csGetPropertyValue<T>()
  • SessionEvents.csGetPropertyValue<T>()
  • SimpleJSON.cs — indexer getter/setter, Add(), Remove()

2. Collection<InvokedDataCollector>HashSet<InvokedDataCollector>

ParallelRunDataAggregator.InvokedDataCollectors was a Collection<T> with O(n) .Contains() called inside a foreach loop (O(n²) behavior). Changed to HashSet<T> for O(1) lookups. The type already implements IEquatable<T> and GetHashCode(). Updated call sites in ParallelRunEventsHandler and ParallelDataCollectionEventsHandler to wrap with .ToList() when passing to APIs expecting Collection<T>.

3. new[] allocation inside LINQ predicate → static HashSet

In EnableBlameArgumentProcessor.cs, new[] { "CollectAlways", "DumpType" } was allocated per element inside .Where() predicates. Extracted to file-scoped static HashSet<string> fields with StringComparer.OrdinalIgnoreCase.

4. String += → string interpolation

Consolidated two sequential options += calls into a single interpolated string in TestRunnerConnectionInfoExtensions.cs.

Testing

  • Build: ✅ build.cmd -c Release — 0 errors, 0 warnings
  • ParallelRunDataAggregator tests: ✅ 44 passed
  • TestSessionPool tests: ✅ 8 passed
  • InProcDataCollection tests: ✅ 30 passed
  • EnableBlame tests: ✅ 26 passed

Copilot AI review requested due to automatic review settings March 20, 2026 10:11
@Evangelink Evangelink enabled auto-merge (squash) March 20, 2026 10:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Performance-focused refactor to reduce redundant dictionary lookups/iterations and cut per-call allocations, with a small concurrency-safety improvement in run stats aggregation.

Changes:

  • Replaced ContainsKey + indexer patterns with TryGetValue to avoid double lookups.
  • Switched invoked data collector tracking to a HashSet to avoid O(n²) .Contains() patterns.
  • Removed a per-element array allocation in LINQ predicates by using static precomputed key sets.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
test/Microsoft.TestPlatform.CrossPlatEngine.UnitTests/Client/Parallel/ParallelRunDataAggregatorTests.cs Updated assertions for HashSet semantics and strengthened thread-safety test by adding a concurrent reader task.
src/vstest.console/Processors/EnableBlameArgumentProcessor.cs Replaced per-element new[] allocations in LINQ filters with static key sets.
src/Microsoft.TestPlatform.ObjectModel/Navigation/PortableSymbolReader.cs Simplified nested dictionary lookups via TryGetValue.
src/Microsoft.TestPlatform.ObjectModel/Navigation/FullSymbolReader.cs Eliminated redundant dictionary lookups using TryGetValue.
src/Microsoft.TestPlatform.ObjectModel/DataCollector/InProcDataCollector/TestSessionStartArgs.cs Replaced double-lookup dictionary access with TryGetValue.
src/Microsoft.TestPlatform.ObjectModel/DataCollector/Events/SessionEvents.cs Replaced double-lookup dictionary access with TryGetValue.
src/Microsoft.TestPlatform.ObjectModel/ConnectionInfo/TestRunnerConnectionInfoExtensions.cs Reduced string concatenation operations by consolidating into one interpolated assignment.
src/Microsoft.TestPlatform.CrossPlatEngine/TestSession/TestSessionPool.cs Reduced redundant dictionary lookups in session pool access.
src/Microsoft.TestPlatform.CrossPlatEngine/TestSession/ProxyTestSessionManager.cs Reduced redundant dictionary lookups when resolving proxy indices.
src/Microsoft.TestPlatform.CrossPlatEngine/DataCollection/ParallelDataCollectionEventsHandler.cs Adapted to HashSet by materializing to Collection<T> for downstream APIs.
src/Microsoft.TestPlatform.CrossPlatEngine/DataCollection/InProcDataCollectionSink.cs Reduced redundant dictionary lookups and simplified update flow.
src/Microsoft.TestPlatform.CrossPlatEngine/Client/Parallel/ParallelRunEventsHandler.cs Adapted to HashSet by materializing to Collection<T> for downstream APIs.
src/Microsoft.TestPlatform.CrossPlatEngine/Client/Parallel/ParallelRunDataAggregator.cs Switched invoked collectors to HashSet, added locking in stats aggregation, and reduced redundant lookups.
src/Microsoft.TestPlatform.Common/Utilities/SimpleJSON.cs Removed redundant dictionary lookups and simplified add/update/remove logic.

@nohwnd
Copy link
Copy Markdown
Member

nohwnd commented Mar 20, 2026

brain is not braining anymore, will have a look on monday, sorry!

var testOutcomeMap = new Dictionary<TestOutcome, long>();
long totalTests = 0;
if (_testRunStatsList.Count > 0)
lock (_dataUpdateSyncObject)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: This was reverted before, this is called once at the end, not in parallel. Same with part of the test reverted before on the bottom.

Won't hurt most likely just sets unrealistic expectations from the api, that will confuse us if we end up replacing the method with other code.

Comment on lines +496 to +508
// Also start a reader thread that calls GetAggregatedRunStats concurrently
var readerTask = Task.Run(() =>
{
barrier.SignalAndWait();
for (int i = 0; i < iterationsPerThread; i++)
{
// This must not throw InvalidOperationException due to collection modification
var runStats = aggregator.GetAggregatedRunStats();
Assert.IsTrue(runStats.ExecutedTests >= 0, "Executed tests count should be non-negative");
}
});

Task.WaitAll(aggregateTasks.Append(readerTask).ToArray());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the part I talk about above. Also the barrier.

@nohwnd
Copy link
Copy Markdown
Member

nohwnd commented Mar 24, 2026

looking

nohwnd added a commit to nohwnd/vstest that referenced this pull request Mar 24, 2026
Address review comments from PR microsoft#15533:

- Revert HashSet<InvokedDataCollector> back to Collection with Contains,
  as this code runs once at end, not in parallel (nohwnd feedback)
- Add clarifying comment to Aggregate method about non-parallel usage
- Revert barrier-based parallel test back to simpler sequential version
  (nohwnd feedback - was reverted before)
- Revert test assertions to index-based access (order is deterministic
  with Collection)
- Change BlameParameterNames from string[] to HashSet<string> with
  OrdinalIgnoreCase comparer for O(1) Contains lookups (Copilot/Youssef)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 24, 2026 07:49
nohwnd added a commit that referenced this pull request Mar 24, 2026
Address review comments from PR #15533:

- Revert HashSet<InvokedDataCollector> back to Collection with Contains,
  as this code runs once at end, not in parallel (nohwnd feedback)
- Add clarifying comment to Aggregate method about non-parallel usage
- Revert barrier-based parallel test back to simpler sequential version
  (nohwnd feedback - was reverted before)
- Revert test assertions to index-based access (order is deterministic
  with Collection)
- Change BlameParameterNames from string[] to HashSet<string> with
  OrdinalIgnoreCase comparer for O(1) Contains lookups (Copilot/Youssef)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

IDiaEnumSymbols? enumSymbols = null;
IDiaSymbol? methodSymbol;
Dictionary<string, IDiaSymbol> methodSymbolsForType;
Dictionary<string, IDiaSymbol>? methodSymbolsForType;
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making methodSymbolsForType nullable here reduces clarity: it is expected to be non-null in the TryGetValue(...) true-branch, and (based on the structure) likely initialized in the else branch later. To keep nullability accurate and avoid downstream nullability noise, consider using a non-null local inside the if (e.g., out var methodSymbolsForType) and/or ensuring the variable is always assigned a non-null dictionary before any later use.

Suggested change
Dictionary<string, IDiaSymbol>? methodSymbolsForType;
Dictionary<string, IDiaSymbol> methodSymbolsForType;

Copilot uses AI. Check for mistakes.
Comment on lines 387 to 393
if (_methodSymbols.TryGetValue(symbolName, out methodSymbolsForType))
{
methodSymbolsForType = _methodSymbols[symbolName];
if (methodSymbolsForType.ContainsKey(methodName))
if (methodSymbolsForType.TryGetValue(methodName, out var cachedMethodSymbol))
{
return methodSymbolsForType[methodName];
return cachedMethodSymbol;
}
}
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making methodSymbolsForType nullable here reduces clarity: it is expected to be non-null in the TryGetValue(...) true-branch, and (based on the structure) likely initialized in the else branch later. To keep nullability accurate and avoid downstream nullability noise, consider using a non-null local inside the if (e.g., out var methodSymbolsForType) and/or ensuring the variable is always assigned a non-null dictionary before any later use.

Copilot uses AI. Check for mistakes.
Comment on lines +140 to +141
// Note: This method is called once at the end of the test run to aggregate results.
// It is NOT called in parallel, so thread-safety optimizations here would be misleading.
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new note contradicts the XML doc immediately above it: the doc states this must be thread-safe and expected to be called by parallel managers, while the note says it is not called in parallel. Please reconcile these (either update the XML doc to match actual usage, or remove/adjust the note) so future changes don’t rely on conflicting guidance.

Suggested change
// Note: This method is called once at the end of the test run to aggregate results.
// It is NOT called in parallel, so thread-safety optimizations here would be misleading.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note belongs to GetAggregatedRunStats

Comment on lines +75 to +79
if (CollectionData.ContainsKey(key))
{
EqtTrace.Warning("The data for in-proc data collector with key {0} has already been set. Will be reset with new value", key);
CollectionData[key] = value;
}
CollectionData[key] = value;
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This path still does a double-lookup (ContainsKey then indexer assignment). Since this PR targets dictionary lookup reductions, consider switching to TryAdd (warn+overwrite if it returns false) or TryGetValue to keep the warning behavior while avoiding an extra hash lookup on the common add path.

Copilot uses AI. Check for mistakes.
Evangelink and others added 5 commits March 25, 2026 16:35
…locations

- Replace ContainsKey + indexer with TryGetValue across 10+ locations
  (ParallelRunDataAggregator, TestSessionPool, ProxyTestSessionManager,
  InProcDataCollectionSink, PortableSymbolReader, FullSymbolReader,
  TestSessionStartArgs, SessionEvents, SimpleJSON)
- Change Collection<InvokedDataCollector> to HashSet<InvokedDataCollector>
  in ParallelRunDataAggregator for O(1) dedup instead of O(n)
- Extract static HashSets for blame parameter names instead of
  allocating new[] per LINQ predicate evaluation
- Consolidate string += into string interpolation in
  TestRunnerConnectionInfoExtensions
Address review comments from PR #15533:

- Revert HashSet<InvokedDataCollector> back to Collection with Contains,
  as this code runs once at end, not in parallel (nohwnd feedback)
- Add clarifying comment to Aggregate method about non-parallel usage
- Revert barrier-based parallel test back to simpler sequential version
  (nohwnd feedback - was reverted before)
- Revert test assertions to index-based access (order is deterministic
  with Collection)
- Change BlameParameterNames from string[] to HashSet<string> with
  OrdinalIgnoreCase comparer for O(1) Contains lookups (Copilot/Youssef)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove lock and non-parallel note from GetAggregatedRunStats (called
  once at end, not in parallel - lock was reverted before)
- Keep TryGetValue and kvp iteration optimizations
- Restore non-nullable Dictionary<string, IDiaSymbol> in FullSymbolReader

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 25, 2026 15:40
@nohwnd nohwnd force-pushed the dev/amauryleve/perf-fix-dictionary-double-lookups branch from bc0b5f6 to 038b6c5 Compare March 25, 2026 15:40
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

_runDataAggregator.GetAggregatedException(),
_runDataAggregator.RunContextAttachments,
_runDataAggregator.InvokedDataCollectors,
new Collection<InvokedDataCollector>(_runDataAggregator.InvokedDataCollectors.ToList()),
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This materializes a new List<T> via ToList() and then wraps it in a new Collection<T>, adding allocations on the completion path. If the downstream API can be adjusted, prefer accepting IReadOnlyCollection<T> / IEnumerable<T> to avoid forcing a copy. If the signature cannot change, consider passing a one-time snapshot that is created earlier (or reusing a cached snapshot) so this call site doesn't allocate on every completion.

Suggested change
new Collection<InvokedDataCollector>(_runDataAggregator.InvokedDataCollectors.ToList()),
_runDataAggregator.InvokedDataCollectors.ToList(),

Copilot uses AI. Check for mistakes.
@nohwnd
Copy link
Copy Markdown
Member

nohwnd commented Mar 26, 2026

blocked on flakiness from main

@nohwnd nohwnd disabled auto-merge March 26, 2026 16:29
@nohwnd nohwnd merged commit f427180 into main Mar 26, 2026
2 of 4 checks passed
@nohwnd nohwnd deleted the dev/amauryleve/perf-fix-dictionary-double-lookups branch March 26, 2026 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants